Abstract

Thyroid cancer 1 (TC-1) is a disordered protein that is overexpressed in thyroid carcinogenesis. It is found mainly in vertebrates, and thyroid cancer associated genes do not share a common homology. The death rate depends on the type of thyroid cancer. TC-1 showed higher expression in papillary carcinoma than in other tumors. The protein is involved in various biological processes, such as the Wnt/β-catenin signaling pathway, which is implicated in thyroid cancer. A detailed sequence analysis of TC-1 provided information on the conserved regions among primates, birds, rodents, and reptiles. Additionally, different prediction tools and software were used for the prediction of the thyroid cancer 1 protein structure. The STRING database was used for the protein-protein analyses, which were performed with the interacting protein LSM1, and the interacting residues were identified by a computational approach. Comparative docking was performed with compounds from the ZINC library, and the compound with the lowest binding energy was selected for the analysis of computational drug design.


This work is licensed under the Creative Commons Attribution Non-Commercial 4.0 International License.
Introduction


Thyroid cancer (TC) is the irregular growth of cells in the thyroid gland, which makes hormones that control blood pressure, body temperature, heart rate and weight. It is the main malignancy of the endocrine system in the head and neck. TC is classified into four types. Papillary thyroid cancer (PTC) arises from follicular cells, which produce and store thyroid hormones, and about 80% of thyroid cancers are PTC. Follicular thyroid cancer (FTC) also arises from the follicular cells of the thyroid. Medullary thyroid cancer (MTC) originates in the thyroid cells known as C cells, which produce the hormone calcitonin, and anaplastic thyroid cancer (ATC) is a rare type of thyroid cancer that begins in the follicular cells. Thyroid cancer is the 7th most common cancer among females and the 14th most common cancer in males [1, 2]. According to the world cancer statistics report of 2018, a death rate of 0.4% was attributed to TC [3, 4]. Observed environmental and genetic risk factors include a family history of goiter, radiation exposure and certain hereditary syndromes. The key cause of TC is still not known [5].

The transcriptional and immune response regulator (TC-1, also known as TCIM or C8orf4) is involved in the proliferation of TC. No homologous sequences are known for TC-1, which is located on chromosome 8 (8p11.21) [6]. Several studies reveal that TC is associated with the Wnt/β-catenin signaling pathway, as β-catenin and axin play a crucial role in TC mutations. The protein may enhance the WNT-CTNNB1 pathway by relieving the antagonistic activity of CBY1 [7, 8]. It enhances the proliferation of follicular dendritic cells [9]. It also plays a role in the mitogen-activated MAPK2/3 signaling pathway and regulates the transition of the cell cycle from G1 to S phase [10]. In cancers such as PTC or lung cancer, it acts as a promoter of cell proliferation and of the G1-to-S-phase transition and as an inhibitor of apoptosis [11].

TC-1 has 106 amino acids and undergoes rapid proteasomal degradation. Its expression is stimulated by heat shock, certain cell stresses and pro-inflammatory cytokines [12]. Recently, a genome-wide association study (GWAS) related to Parkinson's disease considered an abnormal regulation of blood vessels in the brain [13]. TC-1 is associated with breast cancer, gastric carcinoma and hypertrichosis universalis congenita. The protein acts as a positive regulator of the Wnt/β-catenin signaling pathway, which drives developmental processes and causes tumor formation through misregulation. Chibby (Cby) binds to β-catenin and antagonizes the pathway, and TC-1 is known to target this antagonistic behavior in cancer [14]. TC-1 binds to Cby, which leads to the regulation of many associated cancers. Cellular evidence shows that the protein is mainly localized to the nucleoplasm and also to the plasma membrane and the cytosol.

Over the last decade, personalized medicine and computational drug design have opened numerous possibilities for understanding cancer, and they play an important role in the medical field [15, 16]. Various biological problems have been addressed with bioinformatics approaches, and structural bioinformatics plays a crucial role in cancer drug discovery and mutational analyses [17, 18]. This work aims to predict, evaluate and validate the 3D structure of TC-1 and to study it by virtual screening and protein-protein interaction analyses [19, 20].


Materials and Methods 


The TC-1 protein has no known isoforms. The amino acid sequence of TC-1 was retrieved from the UniProt Knowledgebase under accession number Q9NR00. In the present work, different computational approaches were applied, including sequence analyses, 3D structure prediction, virtual screening and molecular docking analyses.

The genome databases ENSEMBL (http://asia.ensembl.org/index.html) [21] and UCSC Genome Browser (https://genome.ucsc.edu/) [22] were used to locate the TC-1 sequence on the chromosome. The compositional properties of the amino acid sequence were analyzed with COILS [23], ProtParam and ProtScale [24]. The disorder tendency of the protein sequence was examined with the PONDR tool [25], and alignments were generated with Clustal Omega [26], BLAT and PredictProtein [27].

The sequence of TC-1 was retrieved from UniProtKB (https://www.uniprot.org/) [28], and the query sequence was subjected to BLASTp to identify suitable templates. For modeling, MODELLER 9.20 [29] was employed to predict the 3D structure by satisfaction of spatial restraints. The online tools IntFOLD [30], RaptorX [31], CPHmodels [32], ESyPred3D [33], HHpred [34], Phyre2 [35], Robetta [36], SWISS-MODEL [37], I-TASSER [38], SPARKS-X [39], M4T [40], ModWeb [41] and 3D-JIGSAW [42] were also employed for protein structure prediction. UCSF Chimera 1.13 [43] and PyMOL [44] were used to visualize the 3D structures. The predicted structure was energy-minimized with UCSF Chimera 1.13. The MolProbity [45] online server was used to evaluate the predicted structure. Various evaluation tools, including RAMPAGE [46], PROCHECK [47], ANOLEA [48], Verify3D [49] and ERRAT [50], were used to assess the quality of the protein structure.

To determine the functional partners of the target protein, the STITCH (Search Tool for Interacting Chemicals) [51] and STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) [52] databases were queried for TC-1. The crystal structure of the LSM1 complex, the interacting partner of C8orf4 (PDB ID: 4m75), was retrieved from PDB. The online server PatchDock [53] was used for the docking interaction, and FireDock [54] was employed to refine the results. The GRAMM-X [55] online server was also used for the protein-protein docking studies. The hydrophobic and electrostatic interactions were analyzed with LigPlot [56, 57].
The molecular docking analyses were performed with PyRx [58] using AutoDock. Blind docking was carried out to analyze the orientation and conformation of the ligands in their interactions with the protein. The FDA library was extracted from the ZINC database [59] and virtually screened against the target protein, and the screened molecules were further used for drug design [60, 61].


Results and Discussion 


Structural bioinformatics provides efficient ways to explore knowledge and to develop research approaches that help in the detection of disease and in the design of treatment procedures and medicines. The TC-1 protein showed overexpression in PTC as compared to other types of TC.


Sequence Analyses 


The location of the gene was determined using the ENSEMBL and UCSC Genome Browser databases. The TC-1 gene lies on chromosome 8 at position 8p11.21, on the forward strand of open reading frame 4, and contains 1829 base pairs (Fig. 1).





Fig. 1: Location of the TC-1 gene on chromosome 8 (8p11.21), on the forward strand, containing 1829 bp.


Multiple sequence alignment (MSA) was performed with Clustal Omega to evaluate the similarity among the members of the protein family. In the alignment, '*' marks identical residues and ':' marks similar residues across the three proteins of the TC-1 family (Fig. 2).
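The meaning of the two alignment symbols can be illustrated with a short sketch. It is not the Clustal Omega implementation; it assumes the strongly conserved residue groups commonly listed for Clustal output, ignores the weaker '.' class, and uses a hypothetical toy alignment rather than the TC-1 sequences.

```python
# Residue groups commonly listed for the Clustal ':' (strongly similar) symbol.
# Assumption: this group list is taken as given; the '.' class is not handled.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]


def conservation_line(aligned):
    """Return one symbol per alignment column.

    '*' = all residues identical, ':' = all residues in one strong group,
    ' ' = anything else (including columns that contain a gap).
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    symbols = []
    for column in zip(*aligned):
        residues = set(column)
        if "-" in residues:
            symbols.append(" ")
        elif len(residues) == 1:
            symbols.append("*")
        elif any(residues <= set(group) for group in STRONG_GROUPS):
            symbols.append(":")
        else:
            symbols.append(" ")
    return "".join(symbols)


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["MKAKRSH", "MKAKPSH", "MRAKPTH"]
print(conservation_line(toy_alignment))
```

In the toy alignment, the K/K/R column and the S/S/T column are reported as similar because each falls inside a single group.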

COILS, ProtParam and ProtScale were used to calculate the physicochemical properties. The molecular weight of the protein was computed from the average isotopic masses of the amino acids plus the average isotopic mass of one water molecule. Because the sequence does not contain tryptophan (W), the computed extinction coefficient may carry an error of more than 10%. The theoretical pI depends on the ionizable side chains and indicates the pH at which the protein carries no net charge. The estimated half-life of the protein was 30 hours in vitro. The numbers of negatively and positively charged residues, the total number of atoms and the aliphatic index were also calculated (Fig. 3).
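The molecular weight and extinction coefficient calculations described above can be sketched as follows. This is a minimal illustration and not ProtParam's own code: the average residue masses are standard reference values, the 280 nm coefficients (1490 per Tyr, 5500 per Trp, 125 per cystine) follow the formula commonly documented for ProtParam, and the function names and test peptides are hypothetical.

```python
from collections import Counter

# Average (not monoisotopic) residue masses in Da; one water is added per chain.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524


def molecular_weight(sequence):
    """Sum of average residue masses plus the mass of one water molecule."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def extinction_coefficient(sequence, cystines=0):
    """Estimated molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    return 1490 * sequence.count("Y") + 5500 * sequence.count("W") + 125 * cystines


def composition_percent(sequence):
    """Percentage of each amino acid in the sequence."""
    counts = Counter(sequence)
    return {aa: 100.0 * n / len(sequence) for aa, n in counts.items()}


# Hypothetical short peptide, for illustration only.
peptide = "GAY"
print(round(molecular_weight(peptide), 3))
print(extinction_coefficient(peptide))
print(composition_percent("AAGY"))
```

Without tryptophan, the estimate rests on the weaker tyrosine and cystine terms alone, which is why a larger error is expected for such sequences.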


[Fig. 2 alignment: sp|Q9NR00|TCIM_HUMAN, sp|Q9D915|TCIM_MOUSE and sp|Q5E969|TCIM_BOVIN, 106 residues each.]

Fig. 2: Clustal Omega alignment of the related mouse and bovine proteins with the human protein, showing residues marked with * (identical) and : (somewhat similar).

Fig. 3: Pie chart of the amino acid (a.a.) composition of TC-1 with the calculated percentage values.


Neural network-based disorder prediction for TC-1 revealed that 65 residues lie in disordered regions, approximately 61 percent of the total protein. The PONDR tool, which is trained to detect disordered sequence, was used for this prediction. TC-1 has two disordered segments, one of which is a long stretch of 52 residues (38-89). In the graph, the single middle line is the threshold, and residues that score above the line lie in the disordered region. Therefore, TC-1 was predicted to be a natively disordered protein (Fig. 4).
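The way such numbers follow from a per-residue score profile can be sketched as below. The 0.5 threshold is an assumption (the text only refers to a middle threshold line), the scores are a hypothetical toy profile rather than the PONDR output for TC-1, and the function names are illustrative.

```python
def disordered_segments(scores, threshold=0.5):
    """Return 1-based (start, end) runs of residues scoring above the threshold."""
    segments = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position
        elif start is not None:
            segments.append((start, position - 1))
            start = None
    if start is not None:
        segments.append((start, len(scores)))
    return segments


def disorder_summary(scores, threshold=0.5):
    """Count disordered residues, their percentage and the longest segment."""
    segments = disordered_segments(scores, threshold)
    disordered = sum(end - start + 1 for start, end in segments)
    longest = max(segments, key=lambda seg: seg[1] - seg[0] + 1, default=None)
    return {
        "disordered": disordered,
        "percent": 100.0 * disordered / len(scores),
        "longest": longest,
    }


# Hypothetical toy score profile, for illustration only.
toy_scores = [0.2, 0.6, 0.7, 0.4, 0.9, 0.8, 0.95, 0.1]
print(disorder_summary(toy_scores))

# Arithmetic behind the reported figures: 65 of 106 residues, and stretch 38-89.
print(round(100 * 65 / 106, 1), 89 - 38 + 1)
```

The last line reproduces the reported proportion (about 61 percent) and the length of the 38-89 stretch (52 residues).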

Structure Prediction 

The 3D structure of TC-1 (TCIM; C8orf4) has not yet been reported by NMR or X-ray crystallography. To predict the 3D structure, comparative modeling and threading approaches were used. The sequence was submitted to BLASTp against PDB to retrieve suitable templates. The five top-ranked templates, selected by maximum identity, E-value, query coverage and total score, were considered for homology modeling. All the selected templates were used to generate the 3D structure of TC-1 (TCIM; C8orf4). The total query coverage and the similarity of the templates used against TC-1 were >70% from end to end for the homology modeling analyses.
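Template selection of this kind can be sketched as a filter followed by a ranking. The ordering of the criteria, the 70% coverage cutoff applied as a filter, the field names and the template labels are all assumptions made for illustration; the hits below are invented and are not the templates used in this work.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    template: str    # hypothetical label, not a real PDB identifier
    identity: float  # percent identity to the query
    coverage: float  # percent of the query covered by the alignment
    evalue: float    # BLAST E-value (lower is better)
    score: float     # total score (higher is better)


def top_templates(hits, n=5, min_coverage=70.0):
    """Keep hits above the coverage cutoff and rank them.

    Assumed ranking: higher identity first, then lower E-value,
    then higher coverage, then higher total score.
    """
    kept = [hit for hit in hits if hit.coverage > min_coverage]
    kept.sort(key=lambda h: (-h.identity, h.evalue, -h.coverage, -h.score))
    return kept[:n]


# Invented example hits, for illustration only.
hits = [
    Hit("tmplA", identity=35.0, coverage=82.0, evalue=1e-5, score=60.0),
    Hit("tmplB", identity=41.0, coverage=75.0, evalue=3e-4, score=55.0),
    Hit("tmplC", identity=50.0, coverage=40.0, evalue=1e-8, score=70.0),
]
print([hit.template for hit in top_templates(hits)])
```

Here the third hit is dropped despite its high identity because it covers too little of the query.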

Several models were generated with various tools (Robetta, Phyre2, M4T, HHpred, SWISS-MODEL, SPARKS-X, RaptorX, PSIPRED, IntFOLD, ModWeb, 3D-JIGSAW, I-TASSER and MODELLER 9.20) using in silico approaches (threading and comparative modeling) to predict the structures.

All the generated models were evaluated based on quality factor, favored region, allowed region and outliers. Comparative graphs were generated for all the models predicted by the homology and threading approaches, and the most reliable structure was selected from these graphs (Fig. 5).
Fig. 4: Disorder score plot of the TC-1 protein. The X-axis shows the residue number and the Y-axis shows the order and disorder scores.


The overall quality factor of TC-1 was 93.62%, as evaluated by ERRAT. The Ramachandran plot was used to evaluate the predicted model and shows the distribution of the φ and ψ angles. 98% of the residues lie in the favored region, 99% of the residues lie in the allowed region, and only one residue, Val 105, lies in the outlier region. The selected structure was energy-minimized to improve its stereochemistry and was taken as the most optimal model. After critical examination of the evaluation parameters, the most optimal structure was minimized in UCSF Chimera 1.13 with 1000 steps of steepest descent and conjugate gradient (Fig. 6).
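The φ and ψ values plotted in a Ramachandran diagram are backbone dihedral angles, and their calculation can be sketched in a few lines. This is a generic geometric illustration, not the code used by the evaluation servers; the helper names are hypothetical and the coordinates in the example are simple test points rather than atoms of the TC-1 model.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, k):
    return (a[0] * k, a[1] * k, a[2] * k)


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))  # unit vector along the central bond
    v = _sub(b0, _scale(b1, _dot(b0, b1)))  # b0 projected onto the plane normal to b1
    w = _sub(b2, _scale(b1, _dot(b2, b1)))  # b2 projected onto the same plane
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def phi(prev_c, n, ca, c):
    """phi of residue i: C(i-1), N(i), CA(i), C(i)."""
    return dihedral(prev_c, n, ca, c)


def psi(n, ca, c, next_n):
    """psi of residue i: N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, next_n)


# Simple test points: a right-angle twist about the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Each residue contributes one (φ, ψ) pair, and the evaluation tools then assign that pair to the favored, allowed or outlier region of the plot.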


Protein-Protein Interactions 


TC-1 is expressed in the lungs and thyroid and shows nuclear expression in several tissues, mostly in the placenta. The crystal structure of the LSM1 complex, the interacting partner of TC-1, was used for the protein-protein docking studies. The protein-protein docking analyses were performed with the GRAMM-X online server. The interacting residues of the complex were analyzed with UCSF Chimera 1.13 (Fig. 7). The interacting residues of the receptor protein and the ligand protein are listed in Table 1 [62].
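One common way to list interacting residues in a docked complex is a distance cutoff between atoms of the two partners, sketched below. This is an assumption about the general approach and not a description of how Chimera was used here; the 4.0 Å cutoff, the residue labels and the coordinates are all hypothetical.

```python
import math


def interface_residues(receptor, ligand, cutoff=4.0):
    """Return residues of each partner with any atom pair within the cutoff.

    receptor and ligand map a residue label to a list of (x, y, z) atom
    coordinates in angstroms.
    """
    receptor_hits, ligand_hits = set(), set()
    for rec_name, rec_atoms in receptor.items():
        for lig_name, lig_atoms in ligand.items():
            close = any(
                math.dist(a, b) <= cutoff
                for a in rec_atoms
                for b in lig_atoms
            )
            if close:
                receptor_hits.add(rec_name)
                ligand_hits.add(lig_name)
    return sorted(receptor_hits), sorted(ligand_hits)


# Invented coordinates and labels, for illustration only.
receptor = {"REC 1": [(0.0, 0.0, 0.0)], "REC 2": [(10.0, 0.0, 0.0)]}
ligand = {"LIG 1": [(3.0, 0.0, 0.0)], "LIG 2": [(20.0, 0.0, 0.0)]}
print(interface_residues(receptor, ligand))
```

A larger cutoff reports more residues as interacting, so the cutoff should be stated alongside any residue list produced this way.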

Molecular Docking Analyses

The molecular docking experiments generated complexes with different binding energies. The complex with the lowest binding energy was selected for further analyses. The accuracy of the structure was assessed through docking performed with PyRx. The ZINC library compound ZINC00010 showed the lowest binding energy of -8.5 kcal/mol (Table 2). The 2D structure (Fig. 8) of the selected compound was minimized with ChemDraw Ultra 8.0 [63]. The 3D structures from the molecular docking were analyzed with Chimera 1.13 (Fig. 9).
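The selection step can be sketched with the four binding energies reported in Table 2. The code only illustrates ranking by energy; it does not run a docking calculation, and the function names are hypothetical.

```python
# Compound names and binding energies (kcal/mol) as reported in Table 2.
docking_results = [
    ("ZINC00010", -8.5),
    ("ZINC00130", -8.2),
    ("ZINC00131", -8.2),
    ("ZINC00093", -7.9),
]


def rank_by_binding_energy(results):
    """Sort compounds from the lowest (most favorable) energy upward."""
    return sorted(results, key=lambda item: item[1])


def best_compound(results):
    """Return the compound with the lowest binding energy."""
    return min(results, key=lambda item: item[1])


print(best_compound(docking_results))
```

More negative values indicate more favorable predicted binding, which is why the minimum is selected.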

The specificity of functional proteins is related to their structure, and proteins are involved in cellular processes as the molecules of life. Bioinformatics is a field in which various disciplines, including computing, mathematics, artificial intelligence, chemistry and statistics, are combined to facilitate the discovery of new biological ideas [64]. The field of structural bioinformatics has undergone many improvements over the last 10 years. The growth of computational resources, biological data and methodology has increased the size and resolution of studies and has raised more complex research questions [65-67]. The study of proteins and of protein-related information opens many questions related to the health of an organism. Computational analyses shorten the time required and are very useful to researchers [68-70]. The protein structure of TC-1 was predicted using in silico methods and computational approaches.

Fig. 5: Graph of quality factor, favored region, allowed region and outlier region of the TC-1 protein structure predictions obtained from the different modeling tools and software.


Conclusion 


Computational bioinformatics analysis of the TC-1 protein, which is implicated in thyroid cancer in humans, was used to predict the 3D structure from the protein sequence. Docking approaches were implemented for protein-protein interaction and for ligand-based docking analysis with the thyroid cancer 1 protein. These docking analyses will be useful for computational drug design and development.
Fig. 6: 3D structure predicted by the modeling tool Robetta, selected as the optimal structure of the TC-1 protein.

Fig. 7: Protein-protein docking structures.


Table 1: Residues involved in protein-protein docking for the receptor protein and the ligand protein.

Target protein: Thyroid Cancer-1

Target protein residues: LYS 4, GLN 8, VAL 10, SER 15, SER 21, GLY 24, ARG 32, ALA 35, ASN 38, ILE 39, GLN 94

Interacting protein: LSM1

Interacting protein residues: ASP 32, LEU 33, TYR 34, LEU 35, ASP 36, GLN 37, TRY 38, ASN 39, PHE 40, THR 41, THR 42, THR 43, ALA 44, ALA 45, ILE 46, VAL 47, SER 48, SER 49, VAL, ASP 51, ARG 52, LYS 53, ILE 54, PHE 55, VAL 56, LEU 57, LEU 58, ARG 59, ASP 60, GLY 61, ARG 62, LEU 64, PHE 65, GLY 66, VAL 67, LEU 68, ARG 69, THR 70, PHE 71, ASP 72, GLN 73, TYR 74, ALA 75, ASN 76, LEU 77, LEU 78, LEU 79, GLN 80, ASP 81, CYS 82, VAL 83, GLU 84, ARG 85, ILE 86, TYR 87, PHE 88, SER 89, GLU 90, GLU 91, ASN 92, LYS 93, TYR 94, ALA 95, GLU 96, GLU 97, ASP 98, ARG 99, GLY 100, ILE 101, PHE 102, ILE 104, ARG 105, GLY 106, GLU 107, ASN 108, VAL 109, VAL 110, LEU 112, GLY 113, GLU 114, VAL 115, ASP 116, ILE 117, ASP 118, LYS 119, GLU 120, ASP 121, GLN 122, PRO 123, LEU 124, GLU 128, ARG 129, ILE 130, PRO 131, PHE 132, LYS 133, GLU 134, ALA 135, TRP 136, LEU 137, THR 138, LYS 139, GLN 140, LYS 141, ASN 142, ASP 143, GLU 144, LYS 145, ARG 146, PHE 147, LYS 148, GLU 149, GLU 150, THR 151, HIS 152, LYS 153, GLY 154, LYS 155, LYS 156, ALA 158, ARG 159, HIS 160, ILE 162, VAL 163, TYR 164, ASP 165, PHE 166, HIS 167, LYS 168, SER 169, ASP 170


Table 2: Top four compounds with the lowest binding energy from the molecular docking experiment.

Name of compound    Binding affinity (kcal/mol)    RMSD upper bound    RMSD lower bound
ZINC00010           -8.5                           3.37                11.229
ZINC00130           -8.2                           0                   0
ZINC00131           -8.2                           0                   0
ZINC00093           -7.9                           0                   0
Fig. 8: 2D structure of the compound with the lowest binding energy.

Fig. 9: Molecular docking analysis of the compound with the lowest binding energy, identified through virtual screening.